Chapter 7-3: Estimating a population mean, σ known


Recap (proportions). For one of the simulated sets of data (from last time): n = 1000, with its sample proportion $\hat{p}$ and standard error $SE(\hat{p})$.

Fixed sample size, increasing confidence level:
[Table: CL = 100(1 − α)%, critical value z, and interval for 80%, 90%, 95%, ...; only fragments of the interval endpoints survive: .5, .58, .543]

Fixed CL at 95%, increasing sample size (assuming $\hat{p} = 0.50$ every time, which is unlikely):
[Table: sample size, $SE(\hat{p})$, and interval; only one endpoint fragment survives: .564]

Chapter 7-3: Estimating a population mean, $\sigma$ known

Requires:
- The sample is a simple random sample (SRS).
- The value of the population SD, $\sigma$, is known.
- The population is normally distributed or n > 30.

The method is robust with respect to the last requirement, meaning that as long as the population is unimodal and approximately symmetric with no extreme outliers, the methods should work just fine.

Remember that n > 30 is just a guideline. If the underlying population is heavily skewed, then a larger sample may be required. For many populations, samples between 15 and 30 suffice.
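The two proportion recap tables on this page (fixed n with increasing confidence level, and fixed 95% level with increasing n) can be sketched numerically. This is a minimal sketch, assuming the usual large-sample interval $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. The function name `proportion_ci`, the use of $\hat{p} = 0.50$ in the first loop, and the sample sizes in the second loop are illustrative assumptions, not the values from the simulated data.

```python
import math
from statistics import NormalDist


def proportion_ci(p_hat, n, conf):
    """Large-sample CI for a proportion: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # two-sided critical value
    se = math.sqrt(p_hat * (1 - p_hat) / n)       # standard error of p_hat
    return p_hat - z * se, p_hat + z * se


# Fixed n = 1000, increasing confidence level (p_hat = 0.50 is an assumption).
for conf in (0.80, 0.90, 0.95, 0.99):
    lo, hi = proportion_ci(0.50, 1000, conf)
    print(f"CL {conf:.0%}: ({lo:.3f}, {hi:.3f})")

# Fixed 95% level, increasing n (these sample sizes are hypothetical).
for n in (100, 400, 1600, 6400):
    lo, hi = proportion_ci(0.50, n, 0.95)
    print(f"n = {n}: ({lo:.3f}, {hi:.3f})")
```

With n fixed, the interval widens as the confidence level rises; with the level fixed, quadrupling n halves the width because the standard error scales as $1/\sqrt{n}$.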

Define:
- $\mu$ = population mean
- $\bar{x}$ = sample mean

$\bar{x}$ usually provides the best point estimate of the population mean $\mu$:
- $\bar{x}$ is an unbiased estimator of $\mu$.
- For many populations, the distribution of $\bar{x}$ tends to be more consistent than the distributions of other sample statistics.

Margin of error
Again denoted $E$: with probability $1 - \alpha$, the maximum likely difference between the observed sample mean $\bar{x}$ and the true value of the population mean $\mu$.

$E = z_{\alpha/2} \cdot SD(\bar{x}) = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$

Use this to construct a CI, written in any of three equivalent forms:
$\bar{x} - E < \mu < \bar{x} + E$
$\bar{x} \pm E$
$(\bar{x} - E,\ \bar{x} + E)$

Interpreting a CI
Basically the same as for proportions. We must be careful.
- Correct: "We are 95% confident that the population mean falls in the interval."
- Not: "There is a 95% probability that the population mean falls between the limits."
- Not: "There is a 95% chance that the population mean falls between the limits."

After you have used sample data to construct the interval, the interval either contains the truth or it does not. Confidence comes from the method/procedure.
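The margin-of-error formula $E = z_{\alpha/2}\,\sigma/\sqrt{n}$ and the resulting interval can be sketched in a few lines. This is a minimal sketch using only the Python standard library; the function name `mean_ci_known_sigma` and the numbers in the usage lines are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def mean_ci_known_sigma(xbar, sigma, n, conf=0.95):
    """Return (lower, upper, E) for a z-interval for the mean with sigma known."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    margin = z * sigma / math.sqrt(n)             # E = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin, margin


# Hypothetical inputs: sample mean 50, known population SD 10, n = 100.
lower, upper, E = mean_ci_known_sigma(xbar=50.0, sigma=10.0, n=100, conf=0.95)
print(f"E = {E:.2f}, CI = ({lower:.2f}, {upper:.2f})")
```

The interval is a property of the procedure: rerunning it on a new sample gives a different interval, and about 95% of such intervals would contain $\mu$.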

Determining sample size
We can use the expression for the margin of error to determine how large a sample we need to obtain an estimate of $\mu$ with a particular level of confidence:

$E = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}} \;\Rightarrow\; \sqrt{n} = \dfrac{z_{\alpha/2}\,\sigma}{E} \;\Rightarrow\; n = \left(\dfrac{z_{\alpha/2}\,\sigma}{E}\right)^2$

Round n up to the next whole number.

If we are in the process of designing a study, where do we get an estimate of $\sigma$?
- From a pilot study
- From the literature
- From an expert
- Range rule of thumb: $\sigma \approx \text{Range}/4$

Practice 1
The Tyco Video Game Corporation finds that it is losing income because of slugs used in its video games. The machines must be adjusted to accept coins only if they fall within set limits. In order to set those limits, the mean weight of quarters in circulation must be estimated. A sample of quarters will be weighed in order to determine the mean. How many quarters must we randomly select and weigh if we want to be 99% confident that the sample mean is within 0.05 g of the true population mean for all quarters? Based on results from a sample of quarters, we can estimate the population SD as ___ g.
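The sample-size calculation can be sketched as follows. This is a minimal sketch: the function name `required_sample_size` is hypothetical, and the value `sigma=0.1` in the usage line is a placeholder assumption, not the estimate from the quarters problem (that value is not shown above). The confidence level (99%) and margin (0.05 g) are taken from the problem statement.

```python
import math
from statistics import NormalDist


def required_sample_size(conf, sigma, margin):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    # n = (z * sigma / E)^2, rounded UP so the margin is not exceeded.
    return math.ceil((z * sigma / margin) ** 2)


# sigma = 0.1 g is a placeholder, NOT the value from the problem.
n = required_sample_size(conf=0.99, sigma=0.1, margin=0.05)
print(f"quarters needed (with placeholder sigma): {n}")
```

Because n grows with the square of $\sigma/E$, halving the desired margin of error quadruples the required sample size.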

Section 7-4: Estimating a population mean, $\sigma$ unknown

Requires that the sample is an SRS and that the population is normally distributed or n > 30.

We will use $s$, the sample SD, in place of $\sigma$, the population SD:

$SD(\bar{x}) = \dfrac{\sigma}{\sqrt{n}}$; define $SE(\bar{x}) = \dfrac{s}{\sqrt{n}}$

The SE has more uncertainty, so we compensate by making the CI a bit wider, especially for small samples.

We will use the Student t distribution instead of the standard normal distribution:
- It is unimodal and symmetric around zero, but has fatter tails.
- The t distribution varies for each value of n.
- We must use degrees of freedom (df) in addition to the CL to find the critical value.
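To see how the fatter tails translate into larger critical values, the t critical value can be computed numerically and compared with z. This is a rough sketch using only the standard library: it integrates the Student t density with Simpson's rule and finds the quantile by bisection. The helper names (`t_pdf`, `t_cdf`, `t_critical`) are hypothetical; in practice a t table or a statistics package would normally be used instead.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0 via symmetry: 0.5 + integral of the pdf over [0, x]."""
    h = x / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(conf, df):
    """Two-sided critical value t_{alpha/2, df}, found by bisection."""
    target = 1 - (1 - conf) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


z = NormalDist().inv_cdf(0.975)
for df in (4, 9, 29, 1000):
    print(f"df = {df}: t = {t_critical(0.95, df):.3f} (z = {z:.3f})")
```

The t critical value exceeds z for every df and approaches it as df grows, which is why the extra width of the t interval matters mostly for small samples.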

For a confidence interval for a single mean based on a collection of sample data, df = n − 1.

The df have to do with the number of sample values that can vary after certain restrictions have been imposed on all data values. When estimating the mean from a sample, we have the following situation: say we have 5 sample values and we know that the mean is 50. This means that the sum of the 5 sample values must be 250 (5 × 50). We can freely assign 4 of the sample values, after which the 5th value is determined as 250 minus the sum of the first 4 values. Since 4 scores can be freely selected, we say there are 4 df = n − 1.

The critical value is $t_{\alpha/2,\,df}$, and

$E = t_{\alpha/2,\,df} \cdot \dfrac{s}{\sqrt{n}} = t_{\alpha/2,\,df} \cdot SE(\bar{x})$

Example
Because cardiac deaths appear to increase after heavy snowfalls, an experiment was designed to compare cardiac demands of snow shoveling to those of using an electric snow thrower. Ten subjects cleared tracts of snow using both methods, and their maximum heart rates (beats per minute) were recorded during both activities. The following results were obtained:

Manual: n = 10; $\bar{x}$ = 175; s = 15
Electric: n = 10; $\bar{x}$ = 14; s = 18

a) Find the 95% CI estimate of the population mean for those people who shovel snow manually.
b) Find the 95% CI estimate of the population mean for those people who use the electric snow thrower.
c) If you are a physician with concerns about cardiac deaths fostered by manual snow shoveling, what single value in the confidence interval from part (a) would be of greatest concern?
d) Compare the confidence intervals from parts (a) and (b) and interpret your findings.
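Part (a) can be worked through as a sketch of the t-interval procedure. The summary statistics for the manual group (n = 10, $\bar{x}$ = 175, s = 15) come from the example; the critical value 2.262 is the standard t-table entry for a 95% level with df = 9. The function name `t_interval` is hypothetical.

```python
import math


def t_interval(xbar, s, n, t_crit):
    """CI for a mean with sigma unknown: xbar +/- t_crit * s / sqrt(n)."""
    se = s / math.sqrt(n)   # SE(xbar) = s / sqrt(n)
    margin = t_crit * se    # E = t_{alpha/2, df} * SE(xbar)
    return xbar - margin, xbar + margin, margin


# Manual shoveling group: df = n - 1 = 9, so t_{0.025, 9} = 2.262 (table value).
lower, upper, E = t_interval(xbar=175.0, s=15.0, n=10, t_crit=2.262)
print(f"E = {E:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")  # (164.27, 185.73)
```

For part (c), the value of greatest concern would be the upper limit, since it is the largest plausible mean maximum heart rate under this interval.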

Using a CI to find the point estimate and the margin of error

$\bar{x} = \dfrac{(\text{upper confidence limit}) + (\text{lower confidence limit})}{2}$

$E = \dfrac{(\text{upper confidence limit}) - (\text{lower confidence limit})}{2}$
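Recovering the point estimate and margin of error from a reported interval is a two-line computation. A minimal sketch; the function name `estimate_and_margin` and the limits in the usage line are hypothetical.

```python
def estimate_and_margin(lower, upper):
    """Recover the point estimate (midpoint) and margin of error (half-width)."""
    xbar = (upper + lower) / 2
    margin = (upper - lower) / 2
    return xbar, margin


# Hypothetical reported interval (10.0, 20.0).
xbar, E = estimate_and_margin(10.0, 20.0)
print(f"point estimate = {xbar}, margin of error = {E}")
```

This works for any interval of the form $\bar{x} \pm E$, whether it was built with z or with t, because such intervals are symmetric about the sample mean.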